ABSTRACT


This  study  compares  two  methods  of  statistical  sampling  for  application  in  a 
contracting  context.  The  methods  are  compared  with  the  intent  of  demonstrating  the 
superiority  of  one  method  over  the  other  in  assisting  price  analysts  and  contract 
negotiators  in  expediting  processing  of  proposals  for  change  orders  while  maintaining 
acceptable  levels  of  risk.  The  Basket  Method  and  Stratified  Random  Sampling 
techniques  are  examined  to  determine  which  method  allows  a  more  accurate  estimate 
of  a  proposal  population  to  be  made.  The  several  populations  used  in  the  simulation 
have  errors  planted  to  represent  both  random  "honest"  mistakes  and  weighted 
"dishonest"  mistakes.  The  author  concludes  that  the  Basket  Method  has  a  more 
desirable  accuracy  pattern  than  the  Stratified  Random  Sampling  Technique. 
I. INTRODUCTION


A.  PURPOSE  OF  THIS  STUDY 

The  purpose  of  this  study  is  to  examine  two  methods  of  statistical  sampling 
which  may  have  application  in  assisting  government  price  analysts  and  contract 
negotiators  in  expediting  processing  of  proposals  for  change  orders.  The  study  will 
describe  how  and  why  the  use  of  statistical  sampling  may  expedite  proposal  negotiation 
while  maintaining  acceptable  levels  of  risk,  and  will  determine  which  of  the  two 
sampling  methods  examined  provides  more  acceptable  results  under  the  given 
conditions. 

B.  NEED  FOR  THIS  STUDY 

Current  defense  acquisition  procedures  often  involve  situations  in  which 
Department  of  Defense  (DOD)  agencies  must  deal  with  a  sole  source  supplier  in 
buying material or services. In major weapon system acquisitions, for example, the
Department of Defense typically issues a large number of change orders to modify an
existing contract. A lead ship or aircraft production contract may generate over 10,000
change orders. Why must such a large volume of change orders be issued? After a
prime contract is awarded and production begins, design changes are often necessitated
by a change in performance requirements requested by the government or by unforeseen
technical problems, which almost always seem to crop up. Each design change requires
a  modification  to  the  prime  contract  called  a  change  order.  In  each  case,  the 
contractor  prepares  a  proposal  reflecting  his  estimate  of  what  the  requested  change  will 
cost.  The  two  parties  (government  and  contractor)  must  then  negotiate  a  price  for 
each  change;  and,  because  the  prime  contractor  is  the  logical  one  to  incorporate  the 
requested  change,  there  is  no  competition  to  help  assure  that  the  government  receives 
the  fairest  possible  price.  The  only  mechanisms  working  to  assure  a  fair  price  for  the 
change  order  are  the  adequacy  of  the  contractor's  estimating  procedures,  the 
contractor's  inherent  honesty  and  desire  to  provide  a  good  product  at  a  fair  price,  and 
the  analysis  of  the  proposal  by  government  price  analysts. 

Federal  Acquisition  Regulations  (FAR)  require  the  government  to  analyze  each 
proposal  prior  to  negotiation  to  assure  that  the  proposal  represents  a  fair  price.  The 
analysis  and  negotiation  of  costs  for  each  proposal  is  done  by  some  cognizant 
government agency. Often, a group of government employees has been assigned to
perform  such  functions  in  residence  at  the  contractor's  plant.  The  volume  of  work 
thus  generated  and  the  amount  of  money  involved  are  quite  substantial.  This  volume 
of  work  combined  with  a  lack  of  sufficient  numbers  of  government  analysts  leads  to 
large  backlogs  of  unprocessed  proposals.  To  perform  a  really  thorough  analysis  and 
patient  negotiation  takes  much  more  time  than  government  analysts  are  currently  able 
to  give  to  a  proposal.  If  analysts  do  try  to  take  more  time  and  be  more  thorough,  they 
fall  still  farther  behind  as  the  backlog  continues  to  grow.  Therefore,  there  is 
tremendous  pressure  on  analysts  to  expedite  their  work  even  though  it  is  generally 
recognized  that  hurried  analysis  and  negotiation  can  result  in  costly  overpayment  since 
quickness commonly works against thoroughness and accuracy [Ref. 1: pg. 2].

Unprocessed  proposals  can  result  in  extra  expense  for  the  contractor  too.  In 
many  situations  involving  ongoing  production  or  repair  work,  the  proposed  work  is 
begun  before  the  proposal  is  analyzed  and  negotiated  to  avoid  expensive  delay  and 
disruption  costs.  The  contractor,  however,  except  for  partial  advances  called  "progress 
payments",  is  not  paid  until  after  the  proposal  is  processed.  Because  of  this,  the 
contractor may have to borrow working capital to cover funds tied up in the backlog,
thus incurring capital costs.

It is generally recognized that it is in the best interest of both parties to expedite
the processing of the proposals without sacrificing accuracy. If allowed by the
regulations, analyzing and negotiating a sample of proposals selected with an effective
statistically based sampling technique could have these effects. If a suitable sample of
proposals  were  selected  from  the  backlog  and  carefully  analyzed  and  negotiated,  the 
resulting  data  could  be  extrapolated  to  estimate  what  the  results  would  have  been  had 
every  proposal  received  the  same  treatment. 

The  reason  for  auditing  the  proposal  population  is  to  ensure  the  proposals  reflect 
costs  that  are  fair  and  reasonable.  During  an  audit  the  government  analyst  will  find 
that  one  of  three  possible  conditions  exists.  First,  the  government  may  feel  that  the 
proposal  has  been  understated,  i.e.,  the  contractor's  proposed  cost  for  a  change  to  the 
contract  is  less  than  the  actual  cost  the  contractor  will  incur.  Second,  the  government 
may  conclude  that  the  contractor's  proposal  is  overstated.  Third,  the  audit  findings 
may  conclude  that  the  proposal  is  reasonable.  Any  overstatement  or  understatement  is 
considered to be an error in the proposal population. A sampling technique which
allowed no sampling error (with sampling error being defined as the chance that a
sample which is statistically selected and evaluated will lead to the wrong conclusion or
to an inaccurate projection) would select a sample from the population of proposals
which, when audited, would always give an estimated value for the population as a
whole which was exactly correct, no matter what the degree or distribution of errors in
the proposal population. Since sampling errors are due entirely to chance and are
inherent in any sampling process, we cannot expect a sample of "n" proposals from the
proposal population to provide an error-free characterization of the "N" proposals in
the population.

Assuming, then, that there will be some degree of error in the prediction of the
true value of the entire proposal population whenever a sampling technique is used, the
behavior of the degree of error must be predictable and exhibit certain qualities in
order  for  the  sampling  technique  to  be  considered  appropriate  for  the  purpose 
described  above. 

Specifically, the degree of error should not be easily altered by the distribution,
size, or type (overstatement or understatement) of errors found in the population. If
certain patterns of errors caused the entire population to be evaluated as understated,
then the government would pay more than a fair price for the changes described by the
population of proposals which contained those errors [Ref. 2: pg. 7]. Since both parties
to  the  negotiation  would  have  to  enter  into  a  binding  agreement  to  abide  by  the  results 
of  using  statistical  methods,  all  aspects  of  the  sampling  and  estimation  process  must  be 
disclosed in advance. With this necessary advance knowledge, a shrewd contractor
could  carefully  seed  his  proposal  population  with  deliberate  errors  of  the  appropriate 
size,  type,  and  distribution  and  thereby  be  awarded  a  larger  payment  from  the 
government. 

Therefore, the desired sampling technique will not necessarily be the one which
results in the most accurate average estimate for various error arrangements in the
proposal  population.  It  will  instead  be  the  method  which  responds  least  to  variations 
in  the  arrangement  of  errors. 

C.  METHODOLOGY 

As  mentioned  previously,  the  purpose  of  this  study  is  to  examine  the  effectiveness 
of  two  sampling  techniques  that  appear  to  be  most  suitable  for  the  purpose  of 
analyzing  contract  change  orders.  The  two  sampling  techniques  to  be  studied  are 
Stratified  Random  Sampling  and  the  Basket  Method.  In  this  study,  the  two  methods 
will be used to draw samples from populations for evaluation. The populations were
previously used in a joint study of the American Institute of Certified Public
Accountants and the American Statistical Association [Ref. 3]. The data consist of two
columns of values which represent the proposed, or book, value of a contract change
and the audited, or true, value of the change. The populations are rigged with either
random  or  planned  errors.  The  samples  drawn  by  the  two  methods  from  each 
population  will  be  evaluated  and  compared  to  determine  which  method  gives  a  better 
estimate of the whole population according to the goals described above. Both the
error  rigging  and  evaluation  steps  are  explained  further  in  the  description  of  the 
simulation. 

The  amount  of  work  associated  with  auditing  is  more  closely  correlated  to  the 
number  of  items  being  audited  than  to  the  total  dollar  value  of  all  the  items  being 
audited.  Therefore,  the  sampling  rules  of  the  two  methods  will  be  adjusted  so  that  they 
will  draw  samples  with  the  same  number  of  proposals  from  each  population.  The 
results  of  this  study  will  then  indicate  which  method  yields  the  more  desirable 
prediction  while  holding  the  cost  of  the  audit  constant. 


II.  THE  BASKET  METHOD 


A.  HISTORY 

The "Basket Method" of sample selection was developed by Dr. K. T. Wallenius,
Professor  of  Mathematical  Sciences  at  Clemson  University.  Development  of  the 
Basket  Sampling  method  was  sponsored  by  the  Office  of  Naval  Research  and  Naval 
Material  Command  and  funded  by  the  Office  of  Naval  Research  under  its  Acquisition 
Research  program.  The  Basket  Method  was  developed  as  a  potential  tool  to  assist 
price  analysts  and  contract  negotiators  in  expediting  processing  of  proposals  for  change 
orders  when  dealing  with  a  sole  source  supplier. 

B.  DESCRIPTION 

The  name  "Basket  Method"  is  derived  from  the  manner  in  which  the  population 
is  partitioned  into  separate  groups  (baskets)  prior  to  randomly  selecting  one  of  the 
baskets  as  the  sample.  The  goal  of  partitioning  the  population  into  baskets  by  the 
basket  assignment  process  is  to  make  each  basket  a  good  representation  of  the 
population  as  a  whole.  It  must  be  stressed  at  this  point  that  "representative"  should  be 
thought of in terms of bid prices only.1 Because each basket is representative of the
population as a whole, the spread and proportion of proposal values will be nearly
identical to those of the population. Therefore, it makes no difference which basket is
selected  to  be  audited  in  detail.  The  following  example  will  describe  the  use  of  the 
basket  method  technique. 

1.  Basket  Assignment 

Imagine  having  a  population  of  100  proposals  (N=  100)  from  which  a  10% 
sample  (n=10)  is  to  be  selected.  The  proposals  are  then  arranged  in  order  of 
decreasing  bid  price  and  numbered  accordingly;  that  is,  the  proposal  with  the  largest 
bid price is number 1, the second largest number 2, and so on. The proposals are now
ready to be separated into 10 different baskets. Starting with proposals 1 through 10
(those with the largest bid prices), one proposal is placed in each basket. Each
successive group of 10 proposals is assigned, one per basket, using the following rule:
the largest unassigned proposal is placed in the basket with the smallest sum of bid prices.
For the second group of 10 proposals, this rule results in pairing proposal 11 with 10,
12 with 9, ..., and 20 with 1. Basket subtotals are then calculated and the assignment
rule applied to the third group of 10 proposals. This is repeated until all the proposals
have been assigned. [Ref. 1: pg. 10]

1 It is realized there may be other relevant factors besides bid price that should be
considered in the definition of "representative". For the purposes of this paper,
however, it will suffice to say that sophisticated software can quickly balance baskets
for type of work, degree of labor intensity, level of technology, etc. In short, whatever
characteristics are identified as potentially important to the value of an audit will be
"balanced" by the basket method where possible.

Due to the balancing of basket totals at each stage of the basket assignment
process, the final assignment should yield nearly equal basket totals. Should
additional  balancing  be  required,  the  previously  mentioned  computer  program  can  be 
used  (via  a  swapping  algorithm)  to  bring  basket  totals  into  closer  agreement. 
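The assignment rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software referred to in the text: the function name `assign_baskets`, the tie-breaking by basket index, and the bid prices are all assumptions made for the example, and no swapping step is included.

```python
def assign_baskets(bid_prices, num_baskets):
    """Partition proposals into baskets with nearly equal bid-price totals.

    Proposals are taken in groups of `num_baskets`, largest bids first.
    Within each group, the largest unassigned proposal goes to the basket
    with the smallest running total, one proposal per basket per group.
    """
    ordered = sorted(bid_prices, reverse=True)
    baskets = [[] for _ in range(num_baskets)]
    totals = [0.0] * num_baskets
    for start in range(0, len(ordered), num_baskets):
        group = ordered[start:start + num_baskets]
        # Rank baskets from smallest to largest running total
        # (ties are broken by basket index, an arbitrary choice).
        order = sorted(range(num_baskets), key=lambda b: (totals[b], b))
        for price, b in zip(group, order):
            baskets[b].append(price)
            totals[b] += price
    return baskets, totals


# Hypothetical population: 100 proposals with bid prices 100, 99, ..., 1.
bids = list(range(100, 0, -1))
baskets, totals = assign_baskets(bids, 10)
```

With these made-up bids, the second round pairs the 11th largest bid with the 10th, the 12th with the 9th, and so on, and every basket ends with 10 proposals and a total of 505.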

2.  Estimating  Negotiated  Prices  for  Unsampled  Proposals 

After  the  baskets  are  formed,  one  is  selected  at  random  and  all  its  proposals 
are audited and negotiated. Using the results of the sample negotiation, the sample
ratio factor, R, is computed as in equation 2.1.

$$R = \frac{\text{Total negotiated price of sample}}{\text{Total bid price of sample}} \qquad \text{(eqn 2.1)}$$

The total proposal value of the population is then multiplied by the sample ratio
factor, R, to determine the population audit result. This value will be the estimated
true value of the population.2


2 The sample ratio factor could also be applied individually to each unsampled
proposal, the values summed and the total added to the sum of the negotiated values
of sampled proposals. The result would be the same.
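As a small worked illustration of equation 2.1, the sketch below computes the ratio factor R and the population estimate, and then checks the alternative calculation described in the footnote (applying R to each unsampled proposal and adding the negotiated sample values). The function name and all dollar figures are hypothetical.

```python
def ratio_estimate(sample_bids, sample_negotiated, population_bid_total):
    """Return (R, estimated true value of the population) per equation 2.1."""
    r = sum(sample_negotiated) / sum(sample_bids)
    return r, r * population_bid_total


# Hypothetical data: four proposals, two of which form the audited basket.
population_bids = [120.0, 80.0, 60.0, 40.0]
sample_bids = [120.0, 40.0]
sample_negotiated = [108.0, 36.0]
unsampled_bids = [80.0, 60.0]

r, estimate = ratio_estimate(sample_bids, sample_negotiated, sum(population_bids))

# Footnote variant: negotiated sample values plus R times each unsampled bid.
alternative = sum(sample_negotiated) + sum(r * bid for bid in unsampled_bids)
```

Here R is 144/160 = 0.9, and both calculations give an estimated population value of 270, because the negotiated sample total equals R times the sample bid total by construction.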
III. STRATIFIED RANDOM SAMPLING


A.  THE  CASE  FOR  STRATIFICATION 

Stratified  random  sampling  is  similar  in  many  respects  to  the  technique  of 
unrestricted  random  sampling.3  The  major  difference  is  that  the  population  is  divided 
into  two  or  more  groups  (strata),  each  of  which  is  then  sampled  separately.  The  results 
can  then  be  combined  to  give  an  estimate  of  the  total  population  value. 
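One common way to combine the separately sampled strata is a mean-per-unit expansion: each stratum's sample mean is multiplied by the number of items in that stratum, and the products are summed. The text does not state which estimator is intended, so the sketch below is an assumption, and the function name and values are hypothetical.

```python
import random


def stratified_total_estimate(strata, sample_sizes, rng):
    """Estimate a population total from independent random samples per stratum.

    strata: list of lists of audited values, one list per stratum.
    sample_sizes: number of items to draw (without replacement) per stratum.
    rng: a random.Random instance, so that draws can be reproduced.
    """
    estimate = 0.0
    for values, n_h in zip(strata, sample_sizes):
        sample = rng.sample(values, n_h)
        # Stratum size times the sample mean of that stratum.
        estimate += len(values) * (sum(sample) / n_h)
    return estimate


# Hypothetical population split into a low-value and a high-value stratum.
strata = [[10, 12, 11, 9], [100, 110]]
estimate = stratified_total_estimate(strata, [2, 1], random.Random(0))
```

Because each stratum is fairly homogeneous, any draw of two low-value items and one high-value item gives an estimate between 238 and 266, against a true total of 252.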

The  primary  objective  of  stratification  in  auditing  is  to  reduce  the  impact  of  the 
population  variance  on  the  sampling  plan.  Basically,  a  population  of  heterogeneous 
items  (a  population  with  large  variance)  is  broken  into  two  or  more  groups  or  strata  of 
a  more  homogeneous  nature  (groups  with  small  variances).  The  total  population 
variance  is  unaffected  by  this  process.  However,  it  should  be  intuitively  clear  that 
within  each  group  so  constructed,  the  strata  variance  will  be  smaller  than  the 
population variance. [Ref. 4: pg. 149]

To illustrate, suppose a population consists of seven items: five have a value of
$1 each, and two have a value of $3 each. The variance of this population is close to
$1, but by forming two strata with the five items valued at $1 each in one stratum and
the remaining two items of value $3 in the other stratum, the variance of each stratum
is 0. This reduction in variance by the formation of two strata has important
implications for the amount of sampling error and the size of the sample required. The
relationship can be summarized as follows: Given any population of size N, the lower
the variability, the smaller the sample size required to achieve any given precision4 and
reliability5 requirements. [Ref. 5: pg. 12]
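The arithmetic of the seven-item example can be checked directly. The sketch below uses Python's standard `statistics` module; the variable names are chosen for the illustration only.

```python
from statistics import pvariance, variance

population = [1, 1, 1, 1, 1, 3, 3]   # five $1 items and two $3 items
stratum_low = [1, 1, 1, 1, 1]
stratum_high = [3, 3]

pop_var_n = pvariance(population)    # divides by N:     40/49, about 0.82
pop_var_n1 = variance(population)    # divides by N - 1: 40/42, about 0.95

low_var = pvariance(stratum_low)     # every item equal, so the variance is 0
high_var = pvariance(stratum_high)   # every item equal, so the variance is 0
```

Either definition of variance gives a population value "close to $1", while both strata have zero variance, which is the point of the example.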

While the above example is very simplistic and hypothetical in nature, it does
illustrate the fact that by taking a relatively heterogeneous population and dividing it
up into homogeneous groups the variance of each group will be smaller than that of
the original population. As a result, the sample size required will be smaller than if
unrestricted random samples were taken; or alternatively, the reliability would be
higher or the precision limits narrower. Stratification should therefore be applied to
heterogeneous populations which can be divided into fairly uniform strata on the basis
of some criterion that affects the variable being studied. Under these circumstances,
stratification usually achieves greater precision for a given cost. On the other hand,
stratification is unnecessary in homogeneous populations where there are no discernible
strata that will affect the results.

3 The principle involved in unrestricted random sampling is that every element in
the population should have an equal chance of being included in the sample. Since
"randomness" is difficult to achieve without some kind of aid, a random number table
or a computerized random number generator is often used to ensure random selection.

4 The range within which the true answer most likely falls.

5 The likelihood that the true answer will fall within the established range. It is
usually expressed as a percentage, being the number of times out of one hundred that
the true answer would be contained within the determined margins.

To  use  stratified  sampling,  three  general  rules  must  be  adhered  to  [Ref.  6:  pg.  96]: 

1.  Every  element  must  belong  to  one  and  only  one  stratum. 

2.  There  must  be  a  tangible,  specifiable  difference  that  defines  and  distinguishes 
the  strata. 

3.  The  exact  number  of  elements  in  each  stratum  must  be  known. 

B.  DESCRIPTION  OF  STRATIFIED  RANDOM  SAMPLING 

Once  the  decision  has  been  made  that  stratification  would  be  beneficial  in  the 
sampling  process,  there  are  several  steps  that  must  be  taken.  These  steps  will  be  briefly 
discussed  below. 

1. Establish the Desired Precision and Reliability

Statistical  samples  are  evaluated  in  terms  of  "precision,"  which  is  expressed  as 
a  range  of  values,  plus  or  minus,  around  the  sample  result,  and  "reliability"  (or 
confidence),  which  is  expressed  as  the  proportion  of  such  ranges  from  all  possible 
similar samples of the same size that would include the actual population value.
[Ref. 7: pg. 4]

Basically, the statistical measures of precision and reliability have to do with
how accurate and reliable the sampler wants his sample results to be. An example of
the application of these two measures is helpful in understanding the concepts.
Suppose an auditor is designing a statistical test based on a desire to obtain an
estimate of an audited account value to within $10,000. The $10,000 amount reflects
the auditor's judgment as to what would constitute a material deviation in reported
values. In other words, the auditor does not want his estimate of the audited account
value to be more than $10,000 (either plus or minus) away from the true audited
account value. Reliability is a closely related concept. The auditor's goal is not only
to obtain an estimate within the materiality limit of $10,000 but also to be reasonably
sure that this estimate is sound. Because only a sample is observed judgmentally or
statistically in most audit situations, certainty is impossible. Generally accepted
auditing standards recognize this by requiring reasonable assurance rather than
certainty. Reliability is the statistical measure of that level of assurance stated as a
proportion. For example, a proportion of 0.95 indicates that the auditor wishes to
achieve a 95% level of reliability that the reported amount is not materially different
(plus or minus $10,000) from the audited amount.

Specification of a probable range for a population parameter, a plus or minus
for error, is crucial in indicating the reliability of estimates. This process involves the
construction of a confidence interval for the population parameter being estimated. An
in-depth look at confidence interval construction is beyond the scope of this study,
however, and it is suggested that the reader consult any good statistics textbook for a
detailed discussion of this topic. For this study, no specific precision and reliability
levels will be set, the purpose of this study being to compare the results of the two
sampling methods to each other rather than to attain some specific level of accuracy
and reliability.

2.  Designate  the  Strata  and  Strata  Boundaries 

For all practical purposes, there is currently no way to select the
optimal number of strata or the strata boundaries [Ref. 4: pg. 158]. Useful rules do
exist, however. Ideally, the auditor prefers to base stratification decisions on the
specific variable of interest. In most audit applications, the variable of interest is the
number of audited account values. There is a problem here, however, in that the
number of audited account values is not actually known until after sampling.
Fortunately, a good substitute for audited account values, reported account
values (book values), is usually available. The auditor generally expects a reasonably
high correlation between the available reported account values and the obtained audit
account values, and can be reasonably confident about basing stratification decisions
on the available unaudited reported account values. However, this does limit the
benefits of stratification in that, all other factors being equal, unless the correlation
between reported and audited account values is perfect, the errors introduced by a
particular audited account value belonging to a different stratum than the related reported
account value will eventually negate any further benefits that can be obtained by the
addition of new strata.

It is probably somewhat clear by now that, from a practical viewpoint, the
identification of strata is a heuristic process (a sort of educated guess). In an auditing
context, the approach that is most likely to be beneficial is to obtain some idea of the
underlying character and distributional properties of the population of reported
account values (book values). This can be done manually, but the results are much
more  meaningful  when  a  computer  can  be  utilized.  The  various  output  obtainable 
from  a  computer,  along  with  a  basic  understanding  of  the  data,  may  enable  the  auditor 
to  subjectively  select  strata  of  a  reasonable  nature.  In  some  cases,  the  data  may  lend 
themselves  to  obvious  strata  divisions,  but  in  most  situations  this  will  probably  not  be 
the  case. 

Even if there are a certain number of obvious strata, say two, there are further
questions to be asked. For example, if the use of two strata contributes to a substantial
decline in the population variance, one might reasonably ask, "If two strata gave good
results in reducing variance, wouldn't the use of four strata give results that are twice
as good?" The answer is that, although an increase to four strata might also be beneficial,
it would probably not lead to as large a reduction in the variance estimate. In fact,
such diminishing returns are observed as the number of strata increases. The first
doubling of strata (from one to two) can produce variance reductions of as much as
60% or 70% [Ref. 4: pg. 159]. However, a second and third doubling tend to curtail
the incremental reductions to about 25% [Ref. 4: pg. 159]. Therefore, there is some
point at which the addition of more strata will no longer be useful in reducing variance
estimates, and may in fact increase variance. The only practical way to establish the
limits of strata benefit is by computer simulation. As a general rule, 5 to O'1 strata
usually account (depending on the particular population, of course) for most of the
available variance reduction.

Given the number of strata, the auditor must then determine how and where
to set strata boundaries. Ideally, strata boundaries should be established on the basis
of audited account values, as before. But when these amounts are not available,
reported (book) account values are commonly used as the basis for setting most
boundary values. This substitution will work well if reported account values and
audited account values are closely correlated.

Strata boundaries might be set using the equal dollar value per strata rule,
which, as the name implies, means arranging the strata boundaries such that each
stratum has approximately the same dollar value; or boundaries might be established
based on the equal variance rule, where each stratum has approximately the same variance
measure.
Another rule, sometimes referred to as the O-SUM or CUSUM rule,
establishes the strata boundaries by first creating a frequency distribution of the
recorded (book) account values. The square root of the frequency of recorded account
values in each category is then computed and summed and the resulting total is divided
by the desired number of strata. The auditor attempts to create strata by accumulating
the square-root frequency measures in sequence until the cumulated sum (CUSUM) is
approximately equal to the total accumulation divided by the number of strata. The
next stratum is then composed of the next grouping in the sequence such that the
CUSUM is approximately equal to twice the total accumulation divided by the number
of strata. [Ref. 4: pg. 161]
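The cumulative square-root-of-frequency procedure can be sketched as follows. This is a simplified illustration under stated assumptions: a stratum is closed at the first class whose cumulated sum reaches the next multiple of the step, whereas the text only requires the sum to be "approximately equal"; the function name, class limits, and frequencies are hypothetical.

```python
from math import sqrt


def cum_sqrt_f_boundaries(class_upper_bounds, frequencies, num_strata):
    """Return the upper book-value boundary of every stratum except the last."""
    roots = [sqrt(f) for f in frequencies]
    step = sum(roots) / num_strata      # total accumulation / number of strata
    boundaries = []
    cumulative = 0.0
    for upper, root in zip(class_upper_bounds, roots):
        cumulative += root
        next_target = step * (len(boundaries) + 1)
        # Close a stratum once the cumulated sum reaches its next target.
        if len(boundaries) < num_strata - 1 and cumulative >= next_target:
            boundaries.append(upper)
    return boundaries


# Hypothetical frequency distribution of book values in $100-wide classes.
upper_bounds = [100, 200, 300, 400, 500, 600, 700]
frequencies = [16, 9, 4, 4, 1, 1, 1]
boundaries = cum_sqrt_f_boundaries(upper_bounds, frequencies, 2)
```

For these made-up frequencies the square roots sum to 14, so with two strata the first boundary falls where the cumulated sum reaches 7, which is the $200 class limit.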

For this study, the population will be divided into 10 strata based on the book
value amount of the audit unit. Stratification by book amount is helpful when the
book amounts of the audit units are related to their audit values [Ref. 3]. The
choice of 10 strata was made to facilitate comparison with the Basket Method in that
10 "baskets" will be used when applying the Basket Method in this study.

Strata boundaries will be set using the equal dollar value per strata rule.
Again, this rule is used to facilitate comparison with the Basket Method where basket
totals are nearly equal due to the unique basket assignment process. It may seem that
if the "equal dollar value per strata" rule is used then, conceptually, there is no
difference between the "strata" formed under Stratified Random Sampling and the
"baskets" formed using the Basket Method. There is in fact a significant difference that
stems from the distinctive ways in which the strata and baskets are formulated. Under
the Basket Method, the population is partitioned into baskets in such a way that each
basket will have approximately equal dollar value and contain approximately the same
number of individual elements. Under the Stratified Random Sampling Method, strata
are also partitioned so that they contain approximately equal dollar value but the
number of individual elements in each stratum may vary drastically.
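The contrast with baskets can be seen in a short sketch of the equal dollar value per strata rule. The implementation below is one plausible reading of the rule (items are sorted by book value and a stratum is closed once the running total reaches its share); the function name and the book values are hypothetical, and a population with a few very large items could yield fewer strata than requested.

```python
def equal_dollar_strata(book_values, num_strata):
    """Group items, sorted by book value, into strata of roughly equal dollars."""
    ordered = sorted(book_values)
    target = sum(ordered) / num_strata   # dollar share per stratum
    strata = []
    current = []
    running = 0.0
    for value in ordered:
        current.append(value)
        running += value
        # Close the stratum once the cumulative total reaches its share;
        # the final stratum stays open to absorb whatever remains.
        if len(strata) < num_strata - 1 and running >= target * (len(strata) + 1):
            strata.append(current)
            current = []
    if current:
        strata.append(current)
    return strata


# Hypothetical population: 100 items with book values 1, 2, ..., 100.
strata = equal_dollar_strata(list(range(1, 101)), 10)
stratum_counts = [len(s) for s in strata]
```

With these made-up values each stratum holds roughly 505 dollars, but the lowest stratum contains 32 items and the highest only 5, whereas the Basket Method would place 10 items in every basket.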

3.  Sample  Size  Determination  and  Allocation 

Two methods are generally used to allocate a total sample to individual strata
[Ref. 6: pg. 97]. One method is known as proportional allocation. In this method, the
percentage of the sample allocated to each stratum is the same as the percentage of the
total population accounted for by that stratum. That is,

$$n_i = n \times \frac{N_i}{N} \qquad \text{(eqn 3.1)}$$

where n_i represents the sample size for the ith stratum, n the total sample size, N_i the
number of population items in the ith stratum, and N the total population size.

A generally more effective method, however, is optimal allocation. Optimal
allocation allocates the total sample to the individual strata on the basis of the
"relative" stratum size, N_i, and the stratum standard deviation, SD_i.

$$n_i = n \times \frac{N_i \, \mathrm{SD}_i}{\sum N_i \, \mathrm{SD}_i} \qquad \text{(eqn 3.2)}$$

In equation 3.2, SD_i represents the standard deviation of stratum i. All other variables
are the same as in equation 3.1.
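The two allocation rules can be compared numerically. The sketch below assumes the standard forms of the rules, n_i = n N_i / N for proportional allocation and n_i = n N_i SD_i / (sum of N_i SD_i) for optimal allocation; the function names, stratum sizes, and standard deviations are hypothetical, and the results are left unrounded.

```python
def proportional_allocation(n, stratum_sizes):
    """Proportional allocation: n_i = n * N_i / N."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]


def optimal_allocation(n, stratum_sizes, stratum_sds):
    """Optimal allocation: n_i = n * N_i * SD_i / sum(N_i * SD_i)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * weight / total for weight in weights]


# Hypothetical strata: many low-variability items, few high-variability items.
sizes = [60, 30, 10]
sds = [2.0, 6.0, 30.0]

proportional = proportional_allocation(20, sizes)   # 12, 6 and 2 items
optimal = optimal_allocation(20, sizes, sds)        # 4, 6 and 10 items
```

In this example the optimal rule shifts most of the sample toward the small, highly variable stratum, while the proportional rule simply mirrors the stratum sizes.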

Although  the  optimal  allocation  method  is  generally  more  effective,  the 
proportional  allocation  method  will  be  utilized  in  this  study.  Proportional  allocation 
will  give  more  meaningful  results  (for  comparison  with  the  Basket  Method)  for  this 
investigation  given  that  strata  boundaries  are  being  set  using  the  "equal  dollar  value 
per strata rule." If optimal allocation were used, in two strata the sample sizes
calculated using equation 3.2 would be greater than the total number of elements in the
strata. If this were to happen in a real world sampling situation, each affected stratum's
sample size would be set equal to its population size and sample sizes would be
recalculated for the remaining strata. The saturated strata would then be audited 100
percent.  To  do  this  for  this  study  would  not  facilitate  comparison  of  the  two  methods 
of  sampling  under  "like"  circumstances. 
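The saturation procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: all standard deviations are positive, the total sample does not exceed the population, and the function name and figures are hypothetical rather than taken from the study.

```python
def optimal_allocation(sizes, std_devs, total_sample):
    """Optimal allocation with saturated strata capped at 100 percent."""
    allocation = [0.0] * len(sizes)
    open_strata = set(range(len(sizes)))
    remaining = total_sample
    while open_strata:
        weight_sum = sum(sizes[i] * std_devs[i] for i in open_strata)
        trial = {i: remaining * sizes[i] * std_devs[i] / weight_sum
                 for i in open_strata}
        saturated = [i for i in open_strata if trial[i] > sizes[i]]
        if not saturated:
            for i in open_strata:
                allocation[i] = trial[i]
            break
        # A saturated stratum is audited in full; the rest is reallocated.
        for i in saturated:
            allocation[i] = sizes[i]
            remaining -= sizes[i]
            open_strata.remove(i)
    return allocation


# Hypothetical strata: the third is small but highly variable.
allocation = optimal_allocation([1000, 200, 10], [1.0, 5.0, 400.0], 100)
```

In this example the first pass would assign about 67 items to a stratum that holds only 10, so that stratum is fixed at 10 and the remaining 90 items are split between the other two.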

The sample size computations depend on whether the optimal or proportional
allocation method is used. There are equations to be used for each method in
calculating the appropriate sample size required to achieve a stated level of precision
and reliability. The equations will not be enumerated here because sample size
requirement calculations are not required for the purposes of this study.
This is because in actual applications, where the true value of the population is not
known, the only way to be reasonably certain that one's results are valid is by
complying with rules which will tie the audit to statistical theory. The sampling rules
for Stratified Random Sampling are designed to do just that, so that the auditor who
follows the sampling procedures will be able to determine the extent of the audit
required to achieve the desired level of certainty.

In this study the true values of the proposals are known, as are the size and
distribution of errors, and the Stratified Random Sampling method is not being
compared to its theoretical limits, but to a second method to determine which of the
two is the more desirable in a certain case. The sample size in this study will be chosen
arbitrarily, and is further described in the Description of Simulation section.

4. Select a Random Sample of Size $n_i$ From Each Stratum

5. Calculate the Mean of Each Stratum Based On $n_i$ for Each Stratum

6.  Calculate  the  Estimated  Audited  Population  Total 

This calculation involves taking the mean of each stratum (derived in step 5
above), multiplying it by the total number of items in the stratum, and then summing
the results. This gives the estimated audited population total, which can be
mathematically represented as follows:

$\sum \bar{x}_i N_i$ (eqn 3.3)

where $\bar{x}_i$ represents the mean of stratum i, $N_i$ the total number of items within stratum
i, and $\sum \bar{x}_i N_i$ the sum of ($\bar{x}_i N_i$).
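A short Python sketch of this estimate, using hypothetical audited values and stratum sizes chosen only for illustration:

```python
from statistics import mean


def estimated_population_total(samples, stratum_sizes):
    """Sum over strata of (sample mean x number of items in the stratum)."""
    return sum(mean(sample) * size
               for sample, size in zip(samples, stratum_sizes))


# Hypothetical audited sample values for two strata of 50 and 8 items.
samples = [[10.0, 12.0, 14.0], [100.0, 120.0]]
sizes = [50, 8]
total = estimated_population_total(samples, sizes)
```

The first stratum contributes 12 x 50 and the second 110 x 8, so each stratum's sample mean is scaled up by that stratum's own size before the results are summed.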

7.  Check  Reliability  of  the  Estimated  Audited  Population  Total 

This step involves concluding that one is certain at the reliability specified in
Step 1 that the true book value is within the estimated audited population total
plus-or-minus the achieved precision.*


*There is a formula which can be used to calculate the achieved precision. The
achieved precision should always be less than or equal to the desired or acceptable
precision. If the achieved precision is greater than the acceptable precision, the sample
size is insufficient because the precision limit is too wide. If this were the case, the
sample size would have to be increased.

IV. DESCRIPTION OF SIMULATION


A.  DERIVING  COMPARABLE  RESULTS 

As mentioned previously in this paper, the desired sampling technique is not
necessarily the one that results in the most accurate average estimate for proposal
populations with varying error arrangements. A more important characteristic of the
desired method will be that it responds least to variations in the arrangement of errors.
In other words, it will be the method which is more consistent in its predictions over
various error arrangements and patterns. Since it is the consistency of the drawn
sample which is of interest in this investigation, the sample selection process of both
the Basket Method and the Stratified Random Sampling Method will be used to draw
samples from the same populations. The samples will then be evaluated according to
the basket method, which will give an estimate of the true value of the population, to
see how well each method's sample reflected the value of the population.

The  rules  of  the  basket  method  will  create  the  identical  set  of  baskets  from  a 
given  population  every  time  they  are  applied.  Therefore,  the  "baskets"  of  the  basket 
method  can  easily  be  evaluated  by  a  complete  review  of  the  specific  results.  The  rules 
of  the  Stratified  Random  Sampling  technique  also  provide  a  finite  number  of  samples, 
but  that  number  is  significantly  greater  than  the  number  of  different  baskets. 

To evaluate the sample drawn with the Stratified Random Sampling rules, the
sample is treated as if it were a basket. Then, the sample is evaluated using the basket
method evaluation technique; that is, all of the sample's resident proposals are audited
and their true value is divided by their proposal value. The resulting factor is
multiplied against the population proposal total to determine the best estimate of the
true total value of the proposal population.
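The evaluation just described amounts to a ratio estimate. A minimal Python sketch, with hypothetical (proposal value, audited value) pairs and a hypothetical population proposal total:

```python
def basket_estimate(sample, population_proposal_total):
    """Return the correction factor and the estimated true population total."""
    proposal_sum = sum(proposal for proposal, _ in sample)
    audit_sum = sum(audited for _, audited in sample)
    factor = audit_sum / proposal_sum
    return factor, factor * population_proposal_total


# Hypothetical sample: two proposals are overstated, two are correct.
sample = [(100.0, 100.0), (200.0, 180.0), (50.0, 50.0), (150.0, 120.0)]
factor, estimate = basket_estimate(sample, 10_000.0)
```

Because the same function applies to any set of audited proposals, a stratified random sample can be scored exactly as a basket would be, which is what makes the two selection methods comparable.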

B.  CREATION  OF  THE  TEST  POPULATIONS 

Using the general purpose statistical computing system Minitab, the original
population was seeded with errors at a 5% and 10% rate of occurrence in a random
distribution. The 5% error population (population A) was then skewed to form two
additional test populations. One (population B) had its errors skewed strongly to its
higher valued proposals, and the other (population C) to its lower valued proposals.
The total dollar amount of error and number of overstated proposals remained
constant during the skewing procedure. All populations were created in both a
"dishonest" version, which has overstatements only and is named with a single letter
(populations A, B, C, D, and E), and an "honest" version, with both overstatements
and understatements, named with double letters (AA, BB, CC, DD, and EE). Except for
the sign on each error, the single letter named populations are identical to their double
letter named counterparts. Therefore, populations AA, BB, and CC differ from their
single lettered counterparts only in that they contain errors of both
overstatement and understatement. The populations with 10% errors differ in that E
and EE, while containing the same number of errors in the same distribution and sign
as D and DD respectively, have errors of much larger magnitude, so that the sum of
the dollar value of the errors makes up 10% of the population in E and EE but only
1% of the population in D and DD. The populations are described in Table 1.


TABLE 1

POPULATION DESCRIPTION

NAME                          A       B       C       D       E
Population                    8,300   8,300   8,300   8,300   8,300
Erroneous Proposals           5%      5%      5%      10%     10%
Σ $ Errors / Σ $ Proposals    9%      9%      9%      1%      10%
Types of Errors               +       +       +       +       +
Skew (none, high, or low)     N       H       L       N       N

NAME                          AA      BB      CC      DD      EE
Population                    8,300   8,300   8,300   8,300   8,300
Erroneous Proposals           5%      5%      5%      10%     10%
Σ $ Errors / Σ $ Proposals    9%      9%      9%      1%      10%
Types of Errors               + -     + -     + -     + -     + -
Skew (none, high, or low)     N       H       L       N       N

C.  SIMULATION  EXECUTION 

Simulations were run on all populations using both a Basket Method evaluation
program and a Stratified Random Sampling procedure which was done manually
within Minitab. The Basket Method program utilized was written by Lieutenant James
P. Tortorelli for use in his thesis at the Naval Postgraduate School [Ref. 2: pg. 18].
This program, written in Waterloo BASIC, is listed in Appendix A. The Stratified
Random Sampling simulation process is detailed in Appendix B and consists of the
following basic steps. Each step is referenced by line number to its actual application
in Appendix B:

1. Stratify the Population
(Appendix B line numbers 6 - 45)

2. Allocate the Total Sample to the Strata
(Line numbers 48 - 80)

3. Select a Random Sample From Each Stratum
(Line numbers 81 - 130)

4. Calculate the "Book Value" sum for Audited Items
(Line numbers 131 - 133)

5. Calculate the "Audit Value" sum for Audited Items
(Line numbers 134 - 136)

6. Calculate the Correction Factor
(Line numbers 137 - 139)

7. Calculate the Predicted Population Audit Total
(Line numbers 140 - 142)

8. Calculate the Percent Error
(Line numbers 143 - 148)

As mentioned earlier, ten baskets were arbitrarily chosen for the Basket
Method; this resulted in 830 proposals per basket. Ten strata were then chosen for the
Stratified Random Sampling Method with a total sample size of 830 to be selected.
Strata boundaries were set using the "equal dollar value per strata" rule; therefore, each
stratum has approximately the same total dollar amount contained within it. This rule
was used because it allows better comparison with the Basket Method in that each
"basket" formed under the Basket Method sample selection process has nearly equal
dollar basket totals. Ten trials were run using the Stratified Random Sampling
selection method.
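The "equal dollar value per strata" rule can be sketched as follows. This Python illustration assumes the population is sorted by value and cut wherever the running dollar total crosses the next multiple of the per-stratum target. The function name and the values are hypothetical, and the boundaries actually used in the study are those recorded in Appendix B.

```python
def equal_dollar_strata(values, num_strata):
    """Cut a sorted population into strata of roughly equal dollar value."""
    ordered = sorted(values)
    target = sum(ordered) / num_strata
    strata, current, running = [], [], 0.0
    for value in ordered:
        current.append(value)
        running += value
        # Close the stratum once the cumulative total reaches the next target.
        if running >= target * (len(strata) + 1) and len(strata) < num_strata - 1:
            strata.append(current)
            current = []
    strata.append(current)
    return strata


# Hypothetical population: many small proposals and one large one.
values = [1.0] * 8 + [2.0] * 4 + [8.0]
strata = equal_dollar_strata(values, 3)
strata_totals = [sum(s) for s in strata]
strata_counts = [len(s) for s in strata]
```

The dollar totals come out equal while the item counts differ sharply, which is the contrast with the Basket Method noted earlier.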

The ten audit results for each sample selected by the two methods were then
divided by their respective proposal sums to derive the correction factors as follows:

$F = \dfrac{\text{Total audit sum of sample}}{\text{Total proposal sum of sample}}$ (eqn 4.1)

These correction factors were multiplied by the sum of all proposals to derive the
predicted true audit total for the population. That is,

$PTAT = F \times PSUM$ (eqn 4.2)

where PTAT represents the predicted true audit total for the population, F the
correction factor, and PSUM the sum of all proposals. The difference between the
predicted true audit total and the actual audit total was then divided by the sum of all
proposals to give a percent error for each basket and trial. This calculation can be
mathematically represented as follows:

$PE = \dfrac{PTAT - AAT}{PSUM}$ (eqn 4.3)

where PE represents percent error, PTAT the predicted true audit total for the
population, AAT the actual audit total for the population, and PSUM the sum of all
proposals. The mean percent error for each method was then calculated in the
following manner:

$MPE = \dfrac{\sum PE}{10}$ (eqn 4.4)

where MPE represents the mean percent error and $\sum PE$ the sum of the individual
percent error amounts for each trial. The mean percent errors for each method by
population are listed in Table 2.
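A minimal Python sketch of equations 4.3 and 4.4, using hypothetical totals. Two points are assumptions rather than statements from the text: the factor of 100 that expresses the ratio as a percentage, and the absolute-value variant, which is included because the Appendix A listing appears to accumulate ABS values and because signed errors can offset one another in a plain mean.

```python
def percent_error(predicted_total, actual_audit_total, proposal_total):
    """Equation 4.3, scaled by 100 (assumed) to read as a percentage."""
    return 100.0 * (predicted_total - actual_audit_total) / proposal_total


def mean_percent_error(errors):
    """Equation 4.4: signed mean over the trials."""
    return sum(errors) / len(errors)


def mean_absolute_percent_error(errors):
    """Variant using absolute values, so opposite signs do not cancel."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical population totals and three predicted totals.
proposal_total, actual_total = 10_000.0, 9_100.0
errors = [percent_error(p, actual_total, proposal_total)
          for p in (9_000.0, 9_200.0, 9_150.0)]
signed_mean = mean_percent_error(errors)
absolute_mean = mean_absolute_percent_error(errors)
```

With these figures the signed mean is much smaller than the absolute mean, which shows why the choice between the two matters when methods are compared.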


TABLE 2

SIMULATION RESULTS (MEAN % ERROR)

NAME              A        B        C        D        E
Basket Method      .770     .571     .725     .089     .629
SRS               -.898    1.011     .783    -.078    -.659

NAME              AA       BB       CC       DD       EE
Basket Method      .901     .853    1.099     .065     .774
SRS               -.936   -1.253    1.367    -.105     .971

Detailed results are shown in Appendix C. A positive percent error represents
an overestimate and a negative percent error represents an underestimate. With the
exception of population D, the Basket Method of sample selection was always more
accurate with overstatement errors. For the populations with both overstatement and
understatement errors, the Basket Method was more accurate across the board. The
data from Table 2 are perhaps more vividly illustrated when expressed in a different
manner. The Basket Method errors are expressed as a percent of the Stratified
Random Sampling errors in Table 3.

TABLE 3

BASKET METHOD ERROR AS A %
OF STRATIFIED RANDOM SAMPLING ERROR

NAME       A      B      C      D      E
Percent    85     56     92     114    95

NAME       AA     BB     CC     DD     EE
Percent    96     68     80     61     79
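The Table 3 figures can be reproduced from Table 2 with a simple ratio of magnitudes. The Python sketch below uses the Table 2 entries for populations B and E. The function name is hypothetical, and truncating to a whole percent is an assumption that happens to match the tabulated values for these two populations.

```python
def basket_error_as_percent_of_srs(basket_mpe, srs_mpe):
    """Magnitude of the Basket Method error relative to the SRS error."""
    return 100.0 * abs(basket_mpe) / abs(srs_mpe)


# Mean percent errors for populations B and E, as listed in Table 2.
percent_b = int(basket_error_as_percent_of_srs(0.571, 1.011))
percent_e = int(basket_error_as_percent_of_srs(0.629, -0.659))
```

A value below 100 means the Basket Method's mean error was smaller in magnitude than that of Stratified Random Sampling for that population.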

The average or mean value (in this case mean percent error) in a set of
measurements is only one important summary figure. It is also important to
summarize the extent to which values differ among themselves or about a central value.
One of the most useful statistical measures of variability is the standard deviation.
This measure is based on the concept of deviations from the mean. The deviation of a
sample measurement $y_i$ from its mean $\bar{y}$ is defined as $(y_i - \bar{y})$. The standard
deviation of a sample of n measurements $y_1, y_2, \ldots, y_n$ is defined to be the square
root of the sum of the squared deviations divided by (n - 1). The standard deviation, s,
can be denoted as follows:

$s = \sqrt{\dfrac{\sum (y_i - \bar{y})^2}{n - 1}}$ (eqn 4.5)

As previously mentioned, the measure of standard deviation may be used to show the
degree of variation among values in a given set of data, or it may be used to
supplement an average to describe a group of data. It also may be used to compare
one group of data with another. When the standard deviation is high, the average
(mean) is of less significance as a statistical measure. When the standard deviation is
low, the value of the average is considered to be a highly representative value.
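Equation 4.5 can be checked with a few lines of Python. The percent error values below are hypothetical, and the comparison against `statistics.stdev` from the standard library is included only as a cross-check of the hand-written formula.

```python
import math
from statistics import stdev


def sample_std_dev(values):
    """Square root of the summed squared deviations divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((y - mean) ** 2 for y in values) / (n - 1))


# Hypothetical percent errors from five trials.
errors = [1.0, -1.0, 0.5, 0.0, -0.5]
s = sample_std_dev(errors)
library_s = stdev(errors)
```

Dividing by (n - 1) rather than n gives the sample standard deviation, which is the form the text defines.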

The  standard  deviations  of  the  percent  error  for  each  method  were  calculated 
using  the  data  from  Appendix  C.  The  results  are  given  in  Table  4. 


TABLE 4

STANDARD DEVIATION OF MEAN PERCENT ERROR

NAME              A        B        C        D        E
Basket Method     .725     .468     .473     .083     .462
SRS               .698     .584     .381     .059     .470

NAME              AA       BB       CC       DD       EE
Basket Method     .539     .796     .800     .055     .601
SRS               .915    1.028    1.332     .062     .586

In  looking  at  the  results  in  Table  4,  the  significance  of  the  standard  deviation  figures 
lies  not  so  much  in  whether  they  are  considered  to  be  high  or  low;  the  significance  lies 
in  the  comparable  sizes  of  the  standard  deviations  between  the  Basket  Method  and  the 
Stratified  Random  Sampling  Method.  What  this  means  is  that  the  mean  percent  errors 
for  both  methods  have  about  the  same  "representativeness"  as  far  as  being  a  good 
summary  statistic.  This  lends  more  credibility  to  the  simulation  results  as  a  basis  for 
comparison  of  the  two  methods. 


V.  SUMMARY  AND  CONCLUSIONS 


A. RESISTANCE TO PROPOSAL RIGGING

In order to benefit from the potential time and labor savings a sampling system
offers, the sampling technique must be resistant to padding schemes. If not, a
dishonest contractor has much to gain by trying to selectively pad proposals.
Therefore, as mentioned previously, the primary goal is not necessarily to determine
which of the two methods is the most accurate, but to see which one least benefits
attempted padding schemes. When comparing a method's performance between the
single and double letter versions of a population, it can be seen in Table 2 that both the
Basket Method (with the exception of population D) and the Stratified Random
Sampling method are stricter when estimating the value of the overstatement-only
populations than when estimating the value of the "honest" populations. Therefore,
padding one's contract proposals with overstatements in random, low, or high skewed
distributions prior to submitting them to either method for evaluation is not likely to
raise the resulting estimate for the population, but is instead likely to lower the
estimated value. However, except for population D, the samples drawn with Stratified
Random Sampling allowed the overstatement-only (padded) populations a larger
estimate than did the samples drawn with the Basket Method.

B.  EVALUATION 

Assuming honest contractors are as likely to understate as overstate their costs
and dishonest contractors are not, honest contractors will be more successful (except
for populations D and DD under the Basket Method) than dishonest contractors under
either of the sampling methods. Since the Basket Method allows less benefit to accrue
to the dishonest contractor than Stratified Random Sampling, and because it gives a
more accurate estimate in general, the Basket Method is judged to be a more desirable
sampling method for the purposes addressed in this paper.

C. AREAS FOR FURTHER RESEARCH

Some suggestions for further study are:

Fewer or more baskets and strata may be used.

The Basket Method can be compared to other sampling methods.

A data set with much smaller variance in proposal size could be used.

Additional error arrangement strategies can be developed and tested.


APPENDIX  A 

BASKET METHOD PROGRAM LISTING


00100 REM THIS IS A PROGRAM TO PROCESS DATA USING THE BASKET METHOD
00120 REM DATA IS INPUT FROM A FILE, SEPARATED BY COMMAS, AND LISTED
00140 REM AS PAIRS OF VALUES FOR A BID, THE BOOK FIRST AND THE
00160 REM AUDITED VALUE SECOND. THE PROGRAM EXPECTS DATPOP PAIRS
00180 REM OF VALUES. DATA MUST BE IN DESCENDING ORDER BY BOOK VALUE.
00200 REM
00220 REM ** DIMENSION VARIABLES **
00240 REM
00260 DIM ASUM(50), BSUM(50), ANEXT(50), BNEXT(50)
00280 DIM ERRORP(50), FACTOR(50), ERRORA(50)
00300 REM
00320 REM ** SET CONSTANTS **
00340 REM
00360 B = 10 ! NUMBER OF BASKETS
00380 DATPOP = 8300 ! NUMBER OF DATA PAIRS
00400 BPOP = INT(DATPOP/B) ! INITIATE RUNNING TALLY OF DATA PAIRS READ
00420 OPEN #3, 'TEST (RECFM F LRECL 80)', INPUT
00440 ATOT = 0
00460 BTOT = 0
00480 BPOP1 = 1
00500 FOR J = 1 TO 10
00520 ASUM(J) = 0
00540 BSUM(J) = 0
00560 NEXT J
00580 EES = 0 ! SUM OF ERROR SQUARES
00600 EED = 0 ! SUM OF BASKET DOLLAR SQUARES
01000 REM
01020 REM ** ROUTINE TO READ IN DATA **
01040 REM
01060 IF BPOP1 > BPOP
01080 GOTO 4000 ! IF NO MORE DATA, THEN PROCESS
01100 ENDIF
01120 FOR I = 1 TO B
01140 INPUT #3, BNEXT(I), ANEXT(I)
01160 NEXT I
01180 BPOP1 = BPOP1 + 1
02000 REM
02020 REM ** ROUTINE TO SORT PARTIAL SUMS IN **
02040 REM ** BASKETS IN ASCENDING ORDER **
02060 REM
02080 I = 1
02100 WHILE I < B
02120 IF BSUM(I) > BSUM(I+1)
02140 C1 = BSUM(I)
02160 C2 = ASUM(I)
02180 BSUM(I) = BSUM(I+1)
02200 ASUM(I) = ASUM(I+1)
02220 BSUM(I+1) = C1
02240 ASUM(I+1) = C2
02260 IF I > 1
02280 I = I-1
02300 ENDIF
02320 GOTO 2120
02340 ENDIF
02360 I = I+1
02380 ENDLOOP
03000 REM
03020 REM ** ADD NEXT ROUND TO BASKETS **
03040 REM
03060 FOR I = 1 TO B
03080 BSUM(I) = BSUM(I) + BNEXT(I)
03100 ASUM(I) = ASUM(I) + ANEXT(I)
03120 NEXT I
03140 GOTO 1060
04000 REM
04020 REM ** ADDING ROUTINE **
04040 REM
04060 FOR I = 1 TO B
04080 BTOT = BTOT + BSUM(I)
04100 ATOT = ATOT + ASUM(I)
04120 NEXT I
04140 FOR I = 1 TO B
04160 FACTOR(I) = ASUM(I) / BSUM(I)
04180 ERRORA(I) = BTOT * FACTOR(I) - ATOT
04185 EES = EES + ERRORA(I) * ERRORA(I)
04190 EED = EED + BSUM(I) * BSUM(I)
04200 ERRORP(I) = 100 * ERRORA(I) / BTOT
04220 MAE = MAE + ABS(ERRORA(I))
04240 MPE = MPE + ABS(ERRORP(I))
04260 NEXT I
04280 MAE = MAE / B
04300 MPE = MPE / B
05000 REM
05020 REM ** PRINT RESULTS **
05040 REM

(*5060  PRINT  BASKET  BOOK  VALLE  AUDIT  VALLE  FACTOR  n  > ERROR  ERR  OI  P 
(.*5*080  FORM0S-  TOTAL  «==*==.«  =======.==  =.======.====  ====-  — 

*>5100  FORM  IS  =’  ==  =======.==  :s===s:.s=  =.;r===  =  .====  ======= 


05120  FOR\i:S=  MEAN  *««'*<!' 

05 NO  PRIM  L'SING  FORMuS.  BTOT.  ATOT.  ATOT  BIOT.  1* »«.**< BTOT-ATOT i  BTOI  ..V 
<)51b0  &  BTOT-ATOT 
"51  vi  fOR  I  =  I  TO  B 

j»5:»  PRINT  USING  FORM1S.  I.  BSUM(I).  ASUMlih  FACTOR!  I  *.  F.RRORP* .1  'A 
i >5220  jc  ERRORA:;) 

"'Mu  NEXT  1 

PRINT  l  SING  FORM2S.  MPE.  MAE 
052SH  FORM'S  =  DOLLARS  =========  ========= 

"53u»>  FORM4S  =  CONTRACTS  =========  =======  == 

PRINT  MEAN  AUDITED  $.  D. 

*■»»'* »2t.«  PRINT  USING  FORM3S.BTOT  B.SQR. u  B~EED i-<  BTOT  BTOTu  .  B* >  B- 1  > •> 

"NMO  PRINT  USING  FORV14S.DATPOP  B.O 
06060 REM
06080 REM ** CLEANUP **
06100 REM
06120 CLOSE #3
06140 END


i  SI  BC  '•  use  bock’  =  .bO:!S."5. 

S  i  M I  B  >  wount  cbl  k3 
-  COt  NT  =  332s. 0 

l1’*  M  TB  •  copy  c22  cbu  cb3  cb4; 

IN  SL'BC  ••  use  book'  =  18. "6  2". 21. 

12 1  MTB  count  cb3  k4 
13,  COL  NT  =  18O-.0 

1-'  M  LB  >  copy  c22  cbO  c55  cb6; 
lb>SLBC'»  use  bock’  =  2”. 22:43. 50. 
i>'  MTS  >  count  ebb  kb 
IN  COL  NT  =  12:8.0 

Is.  MIB  >  copy  c22  cb*»  c5T  c5S: 

I'M  SL  BC  •>  use  book’  =  43.51  :~9.'_>0. 

MTB  >  count  cb~  k6 
20  COL  NT  =  698. 'O 

22 ,  MTB  ’>  copy  c22  cbO  cb9  c60; 
2.NSIBC  •  use  book  =  "3.0l:10N3b. 
24.  M  F  3  >  count  cb-)  :<7 
2N  COL  NT  =  429.00 

2''  MIB  •*  copy  c22  cbO  c61  c62: 

2',  SI  BC  »  use  beck  =  10". 36: 141.36. 
_s.  vl  I  B  *  count  cbl  kS 
23)  COUNT  =  341.00 

'"i  MTB  >  copy  c22  cbO  c63  c64; 

3 1 »  SL  BC  >  use  book’  =  141  3~:21l).S2. 
'  2 ;  MT3  >  count  c63  k9 
33 1  COL  NT  =  23S.no 

34.  MTB  >  copy  c22  cbO  ef>5  c66; 

' '  Si  BC  >  use  bock  =  219  S3  436.6b. 
N' ,  N t  T  B  >  count  c6b  k  10 
3".  (  OL  NT  =  13b. no 

3s.  MIB  •  cops  c22  oco  c('“  c6S; 

N.  SLBO  *  use  beck  =  436.66  906.3 1. 
-o  Mill  •  t  tun:  .6"  kl  1 
4  1-  (.OL  NJ  =  n ~  1  m ii i 

'  1 


->_>  M  !  B  >  copy  c22  coO  c 69  c'O: 

43  )  SL  BC  >  use  cook  =  906  32:244< 

44 >  MTB  ->  count  ct>9  k.  1 2 

45 1  COL  NT  =  29.000 

40!  M  LB  ■*  let  k ! 5  =  S30 

4'  i  \H  3  let  kl4  =  S3«)u 

4s i  M  l  B  >  let  k  1 5  =  k!4  *  (k3  kid' 

4u.  \;  n>  :■  round  kl5  kl5 

5"  I  ANSWER  =  332S.OO00 

'I;  V  13  >  let  1\15  =  kl?  *  <k3  kid) 

52>  MTB  •  round  kl5  kl5 

MM  ANSWER  =  333.0000 

MM  MTB  -  let  \ 1 6  =  kl3  *  (k4  kld> 

M'1  MTB  *  rout’d  kl6  k 1 6 

Mm  ANSWER  =  1  SI  0000 

'M  MTB  >  let  kl'  =  kl 3  *  < k5  k  14) 

‘S  •  '.17  B  >  round  kl"  kl“ 

5  A  ANSWER  =  123.O000 

cot  MTB  >  let  klS  =  kl3  ••  ik6  kI4) 
f  1 '  MTB  >  round  klS  kiS 
c2>  ANSWER  =  "O.OooO 

c.M  MTB  ■  let  kid  =  kl 3  k'  k!4> 
'■4 »  MTB  >  round  kl9  kl9 
05)  ANSWER  =  43.uuOO 

MTB  •  ie:  k2o  =  kl?  « kS  kids 
oM  MT  B  >  round  k20  1.20 
<A-  ANSWER  =  34.Oih.su 

MTB  >  let  K21  =  k  1 3  *  ik9  kid; 

~ M  !  B  -  round  k21  k21 
'I  i  ANSW  ER  =  24.0("iu 

M  1  B  •  let  .02  =  k  1 3  WklO  k!4s 
•  '•[  Mi  :>  round  k22  k22 

ANSWER  =  I'.OuOo 

*'■  M  I  U  ■  1st  1\23  =  kl 3  •  (kll  kids 
"*'•  M  IB  -  round  k23  k23 

ANSWER  =  ■’.oooo 

'v'  M  IB  >  let  k24  -  kl3  *  <kI2  kids 
"  MTB  >  round  k24  k24 
*"■  ANSWER  =  3"'h  ;() 

■si  ■  MTB  >  sample  k  1 5  ol  t.52  cM  c' 
'2 •  MTB  >  sum.  Ml  k25 
vW  SE  M  =  4025.3 

v4  /  MI  •  sum.  c~2  k2o 
s'  SIM  =  4u25  3 

’<  M  1  B  ’  sample  kl':  cM  cd  c'3  c' 

M  IB  •  sum.'c’ '  k2" 

"■  Si  M  =  4135.1 


S4)  M  i  B  '  sum  c'4  R2S 
4t>,  SIM  =  41 '5.1 

41  i  M1B  >  sample  kl"  c5a  c5n  c"'  c~T 
42 1  MTB  ■»  <um  c~5  k2,; 

4  3.  SLM  =  4tW‘J.2 

44 1  \l  I  B  >  sum  c~6  k.'o 


>' 1  SI  \ 

t  =  4i  .m) 

4'  >  \1  I  B 

'  simple  klS  c5 

4")  V  I  B 

•  sun:  s'"  k' 1 

9,  SI  \ 

-  4ns 1 > 

-  Ml  B 

sum  c~.s  In  '2 

SI  \1 

=  4i>Sl  .4 

l ■>[■  MTB  > 

sample  k  I a  c54 

!«»:•  MTB  -> 

sum  c'4  k.' ' 

I'M  SLM 

=  412'. 1 

1  "4  ■  M  LB  • 

sum  cko  k'4 

I'M  SLM 

=  254'.  1 

M  I  B  > 

sample  R2>>  cnl 

ii»",  M  TB  • 

sum  csi  k 3 5 

1  ■  'S  1  St  M 

=  404S.2 

I'M  MT3  > 

sum  cS2  k'o 

!!"■  SLM 

-  2f'"S.2 

111'  MTB  • 

san'.pie  k2I 

112  MTB  • 

sum  cS3  k3' 

i!'*  SLM 

=  4  2  <'4  ' 

114.  MTB  - 

sum.  uS4  k'S 

115 ‘  SI  M 

=  3444." 

1  !'■  >  M  I  B  •> 

sam.ple  k  2  2  eta 5 

11".  MTB  • 

sum.  e.s5  R34 

IIS-  SLM 

393". 4 

1  r-  M  I  B  > 

sum  cS6  R40 

■  2"  SLM 

=  3~5“.9 

.2! .  M  !  3  > 

sair.pie  k.2 3  c6~ 

i 22  MIR  • 

'um  cs'  k4 1 

.2-.  SLM 

=  4"ij0.3 

*24.  M  IB  - 

sum  cSS  R42 

25.  SLM 

=  4520.3 

2'v  M  F3  > 

sample  k24  ctr) 

2" -  M  LB  > 

sum  tS4  k.43 

2V'  ■  SLM 

=  -S42.4 

2  m  M  13  > 

sum  e40  R44 

St  M 

-  3 S92  9 

•;  .  M  I  B  - 

et  k45  =  k25 * 

•2'  M  I  B  • 

an;.  I-.45 

■LM' 

13  12. (a 

;4 .  M  I  B  - 

et  ■  —  ... 2 — - 

' :  1  M  IB  ’ 

"r:r.  I<4fa 

&  '  \ 


APPENDIX  C 
DETAILED  RESULTS 


POPULATION  A 


Total  Bock 

Value:  S4C9.605.72 

Total  Audct 

Value:  S3~2,255.~3 

Percent  Error  Using 

Percent  Error 

Ir  ia» 

Basket  Method 

SRS 

1 

-1.211 

.404 

T 

2.307 

.609 

- 

.769 

-  .057 

1 

.330 

.943 

C 

.330 

-2.429 

6 

• 1 .551 

.27  5 

— 

-1.433 

3 

*  •  ,i 

-1.397 

T 

-.350 

.609 

-  .  j>  3  3 

-  .  663 

:  [ear. 

.  o 

-.393 

PTP’JLAT 

ION  AA 

Total  Beck 

Value:  S4C9.605.73 

Total  A-dit  • 

<’alue  :  3410,155.73 

Percent  Error  Vs  mg 

Percent  Error 

.0  13- 

Basket  Method 

SRS 

: 

.110 

-.110 

- 

-  .  769 

-.325 

- 

1.643 

-.109 

** 

-  .939 

-1.843 

“ 

1.643 

-1.644 

: 

-1.210 

.  562 

-1.210 

-1.833 

7 

-  .230 

2.501 

1 

.  549 

.  106 

■  n 

.  549 

-.526 

Mean 

.971 

-  .926 

PCP'JLATI 

7N  B 

Total  Book  V 

'alue:  34T3.5i5.73 

Total  A-dit  V 

al-e :  £271.255. '3 

         Percent Error Using    Percent Error Using
Trial    Basket Method          SRS
1          1.187                  .540
2         -1.231                -1.295
3           .088                  .447
4           .308                 2.016
5          1.187                 -.540
6           .088                  .103
7          -.132                 1.403
8          -.352                 1.106
9          -.571                -1.440
10         -.571                -1.217
Mean        .571                 1.011


POPULATION  BB 


Total  Bock  Value:  3409,515.73 
ctal  Audit  Value:  S408,975.73 

         Percent Error Using    Percent Error Using
Trial    Basket Method          SRS
1          1.890                 -.092
2           .352                -3.139
3          -.527                -1.682
4           .132                 -.085
5          -.747                 1.042
6          -.088                  .572
7         -2.063                -1.415
8          -.088                 1.663
9          1.890                 2.492
10         -.747                  .346
Mean        .853                -1.253


POPULATION  C 


Total  Book  value:  $409,605.73 
otal  Audit  Value:  $372,255.72 

Percent  Error  Using 
Trial  Basket  Method 

1  .769 

2  . 549 

3  -.110 

4  .110 

5  .549 

6  - . 549 

7  -.769 

5  -.989 


Percent  Error  Using 
SRS 
-.173 
.  56  5 
-  .  706 
-1.056 
1.132 
1.250 
-1.232 
.747 
.367 
-.557 


<M  ro 


POPULATION  E 


6  -.002 

7  -.014 

5  .059 

9  -.062 

10  .177 

Mean  .065 


-.156 

-.170 

-.145 

.076 

.126 

-.105 


Total  Book  Value:  3413,756.73 
Total  Audit  Value:  3372,256.73 

Percent  Error  Using  Percent  Error  Using 


Trial 

Basket  Method 

SRS 

1 

-.433 

.  146 

2 

-  .346 

.103 

3 

.  967 

-.944 

4 

-.362 

-  .404 

5 

-1.090 

.434 

6 

.725 

-.231 

7 

1.450 

-.601 

3 

-.121 

1.170 

9 

-  .121 

-1.279 

10 

-.121 

-1.279 

Mean 

.629 

-.659 

POPULATION EE


Total Book Value: $413,756.73
Total Audit Value: $417,656.73

         Percent Error Using    Percent Error Using
Trial    Basket Method          SRS
1           .338                 1.267
2         -1.961                 -.375
3           .822                 1.001
4          -.024                 1.295
5           .943                 1.295
6          1.305                -1.946
7         -1.111                  .105
8          -.024                -1.471
9          -.749                  .459
10          .459                  .500
Mean        .774                  .971
